94 research outputs found

Finite Sample Bernstein -- von Mises Theorem for Semiparametric Problems

Authors: Panov Maxim; Spokoiny Vladimir
Publication venue: Institute of Mathematical Statistics
Publication date: 15/06/2014

The classical parametric and semiparametric Bernstein -- von Mises (BvM) results are reconsidered in a non-classical setup allowing finite samples and model misspecification. In the case of a finite dimensional nuisance parameter we obtain an upper bound on the error of Gaussian approximation of the posterior distribution for the target parameter which is explicit in the dimension of the nuisance and target parameters. This helps to identify the so called critical dimension $p$ of the full parameter for which the BvM result is applicable. In the important i.i.d. case, we show that the condition "$p^{3} / n$ is small" is sufficient for BvM result to be valid under general assumptions on the model. We also provide an example of a model with the phase transition effect: the statement of the BvM theorem fails when the dimension $p$ approaches $n^{1/3}$. The results are extended to the case of infinite dimensional parameters with the nuisance parameter from a Sobolev class. In particular we show near normality of the posterior if the smoothness parameter $s$ exceeds 3/2.

arXiv.org e-Print Archive

CiteSeerX

Constructing Graph Node Embeddings via Discrimination of Similarity Distributions

Authors: Panov Maxim; Tsepa Stanislav
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 06/10/2018

Unsupervised learning of node embeddings in graphs is one of the important directions in modern network science. In this work we propose a novel framework that aims to find embeddings by discriminating distributions of similarities (DDoS) between nodes in the graph. The general idea is implemented by maximizing the earth mover distance between the distributions of decoded similarities of similar and dissimilar nodes. The resulting algorithm generates embeddings that give state-of-the-art performance in link prediction on real-world graphs.

arXiv.org e-Print Archive

Crossref
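
The objective named in the abstract above (the earth mover distance between similarity distributions of similar and dissimilar node pairs) can be illustrated with a small sketch. This is not the authors' implementation: the toy embeddings, the pair lists, and the use of cosine similarity as the "decoder" are all assumptions made for illustration, and the sketch only evaluates the quantity rather than maximizing it. It relies on the standard fact that, for two equal-size one-dimensional samples, the earth mover distance equals the mean absolute difference of the sorted values.

```python
import math


def cosine(u, v):
    """Cosine similarity; stands in for the 'decoded similarity' (an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def emd_1d(xs, ys):
    """Earth mover (Wasserstein-1) distance between two equal-size 1D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical toy embeddings and pair labels, invented for this example.
emb = {"a": (1.0, 0.0), "b": (0.9, 0.1), "c": (0.0, 1.0), "d": (0.1, 0.9)}
similar_pairs = [("a", "b"), ("c", "d")]
dissimilar_pairs = [("a", "c"), ("b", "d")]

sim_pos = [cosine(emb[u], emb[v]) for u, v in similar_pairs]
sim_neg = [cosine(emb[u], emb[v]) for u, v in dissimilar_pairs]

# A larger value means the two similarity distributions are better separated.
separation = emd_1d(sim_pos, sim_neg)
```

In a training loop, `separation` would play the role of the quantity to be increased with respect to the embeddings; how the paper parameterizes and optimizes it is not stated in the abstract.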

Accuracy of Gaussian approximation in nonparametric Bernstein -- von Mises Theorem

Authors: Panov Maxim; Spokoiny Vladimir
Publication date: 01/01/2019

The prominent Bernstein -- von Mises (BvM) result claims that the posterior distribution, after centering by the efficient estimator and standardizing by the square root of the total Fisher information, is nearly standard normal. In particular, the prior completely washes out from the asymptotic posterior distribution. This fact is fundamental and justifies the Bayes approach from the frequentist viewpoint. In the nonparametric setup the situation changes dramatically and the impact of the prior becomes essential even for the contraction of the posterior; see [vdV2008], [Bo2011], [CaNi2013,CaNi2014] for different models, such as Gaussian regression or the i.i.d. model, in different weak topologies. This paper offers another, non-asymptotic approach to studying the behavior of the posterior for a special but rather popular and useful class of statistical models and for Gaussian priors. First we derive tight finite sample bounds on posterior contraction in terms of the so-called effective dimension of the parameter space. Our main results describe the accuracy of Gaussian approximation of the posterior. In particular, we show that restricting to the class of all centrally symmetric credible sets around the penalized maximum likelihood estimator (pMLE) yields a Gaussian approximation up to order $n^{-1}$. We also show that the posterior distribution mimics well the distribution of the pMLE, and we reduce the question of reliability of credible sets to consistency of the pMLE-based confidence sets. The obtained results are specified for nonparametric log-density estimation and generalized regression.

arXiv.org e-Print Archive

Publications Server of the Weierstrass Institute for Applied Analysis and Stochastics

Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampling

Authors: Fedyanin Kirill; Panov Maxim; Tsymbalov Evgenii
Publication date: 26/06/2020

Uncertainty estimation for machine learning models is of high importance in many scenarios, such as constructing confidence intervals for model predictions and detecting out-of-distribution or adversarially generated points. In this work, we show that modifying the sampling distributions for dropout layers in neural networks improves the quality of uncertainty estimation. Our approach consists of two main steps: computing data-driven correlations between neurons and generating samples that include maximally diverse neurons. In a series of experiments on simulated and real-world data, we demonstrate that diversification via sampling based on determinantal point processes achieves state-of-the-art results in uncertainty estimation for regression and classification tasks. An important feature of our approach is that it does not require any modification to the models or training procedures, allowing straightforward application to any deep learning model with dropout layers.

arXiv.org e-Print Archive

Scalable Batch Acquisition for Deep Bayesian Active Learning

Authors: Kotova Daria; Panov Maxim; Rubashevskii Aleksandr
Publication date: 16/02/2023

In deep active learning, it is especially important to choose multiple examples to label at each step in order to work efficiently, particularly on large datasets. At the same time, existing solutions to this problem in the Bayesian setup, such as BatchBALD, have significant limitations in selecting a large number of examples, owing to the exponential complexity of computing mutual information for joint random variables. We therefore present the Large BatchBALD algorithm, a well-grounded approximation to the BatchBALD method that aims to achieve comparable quality while being more computationally efficient. We provide a complexity analysis of the algorithm, showing a reduction in computation time, especially for large batches. Furthermore, we present an extensive set of experimental results on image and text data, both on toy datasets and on larger ones such as CIFAR-100.

Comment: Accepted to SIAM International Conference on Data Mining 202

arXiv.org e-Print Archive

Selective Nonparametric Regression via Testing

Authors: Fishkov Alexander; Noskov Fedor; Panov Maxim
Publication date: 28/09/2023

Prediction with the possibility of abstention (or selective prediction) is an important problem for error-critical machine learning applications. While well studied in the classification setup, selective approaches to regression are much less developed. In this work, we consider the nonparametric heteroskedastic regression problem and develop an abstention procedure that tests a hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor. We prove non-asymptotic bounds on the risk of the resulting estimator and show the existence of several different convergence regimes. The theoretical analysis is illustrated with a series of experiments on simulated and real-world data.

arXiv.org e-Print Archive
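
The idea in the abstract above, abstaining on the basis of a test about the conditional variance rather than on a plug-in variance estimate, can be sketched as follows. This is an illustrative assumption, not the procedure from the paper: the function name `should_abstain`, the normal approximation for the variance estimator, and the inputs `var_hat` (estimated variance at a point) and `var_se` (its standard error) are all hypothetical. The rule predicts only when the null hypothesis "the variance is at least `threshold`" is rejected by a one-sided test.

```python
from statistics import NormalDist


def should_abstain(var_hat, var_se, threshold, alpha=0.05):
    """Abstain unless H0: variance >= threshold is rejected at level alpha.

    Assumes (for illustration only) that the variance estimator is
    approximately normal with standard error var_se.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    upper_bound = var_hat + z * var_se  # one-sided upper confidence bound
    return upper_bound >= threshold


# Low estimated variance, estimated precisely: predict.
confident = should_abstain(var_hat=0.5, var_se=0.05, threshold=1.0)

# Estimate is below the threshold, but too uncertain to rule out a
# variance above it: abstain, where a plug-in rule would predict.
uncertain = should_abstain(var_hat=0.9, var_se=0.2, threshold=1.0)
```

Here `confident` is `False` and `uncertain` is `True`: the second point has a variance estimate below the threshold, yet the rule abstains because the uncertainty of the variance predictor is taken into account.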

The Battle of Kadesh: Paleographic Notes on the Papyrus of Pentauret

Author: Panov Maxim Vjacheslavovich
Publication venue: Science and Innovation Center
Publication date: 01/08/2019

Purpose. The “Battle of Kadesh,” a historical literary composition recounting one episode of the wars of Ramesses II (13th century BCE), is known from several sources. The article is devoted to one of them: the hieratic papyrus of Pentauret. Existing translations into modern languages do not give due consideration to the specific features of the narration preserved in the different versions of the text. Moreover, earlier editors of the papyrus frequently deciphered the handwriting of the scribe Pentauret inadequately. The criticism of mistakes in the generally accepted publications is intended to show the need for a new edition of the manuscript and a separate translation of this document.

Methodology. The source-study part of the investigation is based on the examination of original Late Egyptian documents, namely hieratic manuscripts and hieroglyphic inscriptions available as published photographs, facsimiles, or hand copies. For the comparative paleographic analysis, papyri from several archives with a common provenance have been studied.

Results. The papyrus of Pentauret is the only document in this group in which the date of its writing is recorded. To support dating the text to Merneptah’s reign, the reason why the epithet ‘great’ was applied to the chief of Ramesses II’s enemies is discussed. A paleographic analysis of the final fragment of the text has made it possible to correct mistakes in the earlier hieroglyphic transcriptions by foreign editors. As a result, a new hieroglyphic transcription of the colophon is supplied together with a Russian translation.

Scope of application of the results. The article is intended for specialists in the political history and source studies of the ancient world.

Publishing House Science and Innovation Center: E-Journals / Научно-инновационный центр

Directory of Open Access Journals